SemanticScuttle - klotz.me » klotz: machine learning+llm+transformer

klotz: machine learning* + llm* + transformer*


A Complete Guide to BERT with Code: History, Architecture, Pre-training, and Fine-tuning

In this article, we explore various aspects of BERT, including the landscape at the time of its creation, a detailed breakdown of the model architecture, and a task-agnostic fine-tuning pipeline, demonstrated using sentiment analysis. Despite being one of the earliest LLMs, BERT remains relevant today and continues to find applications in both research and industry.

2024-05-28 Tags: bert, llm, embedding, google, nlp, encoder-only, transformer by klotz

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

This paper introduces Cross-Layer Attention (CLA), an extension of Multi-Query Attention (MQA) and Grouped-Query Attention (GQA) for reducing the size of the key-value cache in transformer-based autoregressive large language models (LLMs). The authors demonstrate that CLA can reduce the cache size by another 2x while maintaining nearly the same accuracy as unmodified MQA, enabling inference with longer sequence lengths and larger batch sizes.

2024-05-26 Tags: transformer, autoregressive language models, key-value cache, attention, multiquery attention, cross-layer attention, machine learning, computer science, llm, mit, csail by klotz
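
The cache-size claim in the Cross-Layer Attention entry above can be illustrated with some rough arithmetic. The sketch below is not taken from the paper: the function name `kv_cache_bytes`, the parameter `cla_sharing_factor`, and the model dimensions (32 layers, 32 heads, head size 128, 4096 tokens, 16-bit values) are all hypothetical, chosen only to show how sharing keys and values across pairs of adjacent layers would halve the cache relative to MQA alone.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch_size,
                   bytes_per_elem=2, cla_sharing_factor=1):
    """Rough KV cache size in bytes for a decoder-only transformer.

    Assumption: with cross-layer sharing, each group of
    `cla_sharing_factor` adjacent layers stores a single set of
    keys and values, so only one layer per group contributes.
    """
    # Number of layers that store their own K/V (ceiling division).
    kv_layers = -(-n_layers // cla_sharing_factor)
    # Factor of 2 accounts for storing both keys and values.
    return (2 * kv_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model dimensions, for illustration only.
dims = dict(n_layers=32, head_dim=128, seq_len=4096, batch_size=1)

mha = kv_cache_bytes(n_kv_heads=32, **dims)       # one K/V head per query head
mqa = kv_cache_bytes(n_kv_heads=1, **dims)        # a single shared K/V head
mqa_cla2 = kv_cache_bytes(n_kv_heads=1, cla_sharing_factor=2, **dims)

for name, size in [("MHA", mha), ("MQA", mqa), ("MQA + CLA2", mqa_cla2)]:
    print(f"{name}: {size / 2**20:.0f} MiB")
```

Under these assumed dimensions, the MQA cache is 32 times smaller than the multi-head baseline, and pairwise layer sharing halves it again, which is the kind of headroom that allows longer sequences or larger batches at a fixed memory budget.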

Transformer architecture:

2023-11-14 Tags: llm, transformer, bert by klotz

Google AI Open-Sources Flan-T5: A Transformer-Based Language Model That Uses A Text-To-Text Approach For NLP Tasks - MarkTechPost

2023-02-07 Tags: google, flan, t5, machine learning, transformer, llm by klotz
